Language Independent Document Retrieval Using Unicode Standard
Similar resources
Content-Based Document Retrieval Using Natural Language
A system for the content-based querying of large databases containing documents of different classes (texts, images, image sequences etc.) is introduced. Queries are formulated in natural language (NL) and are evaluated for their semantic contents. For the document evaluation, a knowledge model consisting of a set of domain specific concept interpretation methods is constructed. Thus, the seman...
Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback
Keyword spotting is a well-known method in document image retrieval. In this method, searching document images is based on a query word image. This paper proposes an approach to document image retrieval based on keyword spotting. The proposed method presents a framework using relevance feedback. Relevance feedback, an interactive and efficient method, is used in this paper to imp...
Language Model Expansion Using Webdata for Spoken Document Retrieval
In recent years, there has been increasing demand for ad hoc retrieval of spoken documents. We can use existing text retrieval methods by transcribing spoken documents into text data using a Large Vocabulary Continuous Speech Recognizer (LVCSR). However, retrieval performance is severely deteriorated by recognition errors and out-of-vocabulary (OOV) words. To solve these problems, we previously...
Using document clustering and language modelling in mediated information retrieval
Our work addresses a well documented problem: users are frequently unable to articulate a query that clearly and comprehensively expresses their information need. This can be attributed to the information need being too ambiguous and not clearly defined in the user’s mind, to a lack of knowledge of the domain of interest on the part of the user, to a lack of understanding of a retrieval system’...
The Unicode Standard, Version 6.2
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and oth...
Journal
Journal title: International Journal of Computer Science and Information Technology
Year: 2014
ISSN: 0975-4660,0975-3826
DOI: 10.5121/ijcsit.2014.6413